
    Forward and bidirectional planning based on reinforcement learning and neural networks in a simulated robot.

    Building intelligent systems that are capable of learning, acting reactively and planning actions before their execution is a major goal of artificial intelligence. This paper presents two reactive and planning systems that contain important novelties with respect to previous neural-network planners and reinforcement-learning based planners: (a) the introduction of a new component (the "matcher") allows both planners to execute genuine taskable planning (while previous reinforcement-learning based models have used planning only for speeding up learning); (b) the planners show for the first time that trained neural-network models of the world can generate long prediction chains that are interestingly robust to noise; (c) two novel algorithms that generate chains of predictions in order to plan, and that control the flow of information between the systems' different neural components, are presented; (d) one of the planners uses backward "predictions" to exploit knowledge of the pursued goal; (e) the two systems neatly integrate reactive behaviour and planning on the basis of a measure of "confidence" in action. The soundness and potential of the two reactive and planning systems are tested and compared with a simulated robot engaged in a stochastic path-finding task. The paper also presents an extensive review of the literature on the relevant issues.
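    The forward-planning scheme described above — chaining one-step predictions from a learned world model and using a "matcher" to test each predicted state against the goal — can be sketched roughly as follows. This is a minimal illustration only: the grid world, the hand-coded model standing in for a trained network, and the breadth-first search over action sequences are all assumptions, not the paper's implementation.

```python
import itertools

# Toy stand-in for a trained neural world model: predicts the next state
# given the current state and an action (a deterministic 5x5 grid here).
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def world_model(state, action):
    """One-step prediction; a real planner would query a trained network."""
    dx, dy = ACTIONS[action]
    x, y = state
    return (min(4, max(0, x + dx)), min(4, max(0, y + dy)))

def matcher(predicted_state, goal):
    """The 'matcher' component: does the predicted state satisfy the goal?"""
    return predicted_state == goal

def forward_plan(start, goal, max_depth=4):
    """Chain predictions forward, returning the first action sequence
    whose simulated outcome satisfies the matcher (breadth-first)."""
    for depth in range(1, max_depth + 1):
        for plan in itertools.product(ACTIONS, repeat=depth):
            state = start
            for action in plan:
                state = world_model(state, action)
            if matcher(state, goal):
                return list(plan)
    return None

print(forward_plan(start=(0, 0), goal=(2, 1)))  # → ['down', 'right', 'right']
```

    Replacing `world_model` with a trained network leaves the planner unchanged; the robustness of long prediction chains then depends on how the model's errors compound along the rollout.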

    A system-level neural model of the brain mechanisms underlying instrumental devaluation in rats

    Goal-directed behaviours are defined by two kinds of effect on instrumental learning. First, degrading the contingency between produced actions and desired outcomes diminishes the number of instrumental responses; second, devaluing a reward lowers the production of the instrumental actions that obtain it. We present a computational model of the neural processes underlying instrumental devaluation in rats. The model reproduces the interaction between the basolateral complex of the amygdala (BLA) and the limbic, associative and somatosensory striato-cortical loops. Firing-rate units are used to abstract the activity of neural populations. Learning is reproduced through dopamine-dependent simple and differential Hebbian rules. Constraints from the anatomy of the projections between neural systems are taken into account. The central hypothesis implemented in the model is that Pavlovian associations between manipulanda and rewards, learned within the BLA, modulate goal selection through the activation of the nucleus accumbens core (NaccCo). Selection processes in the limbic basal ganglia, via the activation of the NaccCo, decide which outcome is chosen as a goal within the prelimbic cortex (PL). Connections between the BLA and the NaccCo are learned through Hebbian associations mediated by feedback from the PL to the NaccCo. Information about selected goals from the limbic striato-cortical loop influences action selection in the sensorimotor loop both through cortico-cortical projections and through a striato-nigro-striatal dopaminergic pathway passing through the associative striato-cortical loop. The model is tested as part of the control system of a simulated rat, reproducing instrumental devaluation tasks. Simulated lesions of the BLA, the NaccCo, the PL and the dorsomedial striatum (DMS), both before and after training, reproduce the behavioural effects of lesions in real rats. The model also provides predictions about the effects of still undocumented lesions.
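    The dopamine-gated simple and differential Hebbian rules mentioned above can be illustrated with a minimal sketch; the learning rate and the exact gating scheme are illustrative assumptions, not the model's actual equations.

```python
def hebbian_update(w, pre, post, dopamine, lr=0.1):
    """Simple Hebbian rule gated by a dopamine signal:
    delta_w = lr * dopamine * pre * post."""
    return w + lr * dopamine * pre * post

def differential_hebbian_update(w, pre, post_now, post_prev, dopamine, lr=0.1):
    """Differential Hebbian rule: the postsynaptic *change* (post_now -
    post_prev) replaces the postsynaptic rate, so the weight grows only
    when presynaptic activity accompanies a rise in postsynaptic activity."""
    return w + lr * dopamine * pre * (post_now - post_prev)

w = 0.0
# Dopamine present: co-active units strengthen the synapse.
w = hebbian_update(w, pre=1.0, post=0.8, dopamine=1.0)
print(round(w, 3))  # → 0.08
# No dopamine: no weight change, mirroring dopamine-dependent plasticity.
assert hebbian_update(w, 1.0, 0.8, dopamine=0.0) == w
```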

    Simulations of oligopolistic markets with artificial agents: Decision procedures as emergent properties of adaptive learning

    While economic models of strategic interaction among autonomous decision-makers are usually based upon principles of optimisation, this work focuses on "satisfying" decision procedures. A flexible simulator of an oligopolistic economic environment, in which autonomous decision-makers evolve their decision procedures through a learning and adaptation process, has been built. Each artificial agent is implemented as a feedforward neural network. The unsupervised learning of the agents is obtained with genetic algorithms that evolve the structure and weights of the neural networks during the simulations. The results show that, as the complexity of the environment overwhelms the cognitive abilities of the agents, decision procedures emerge that are at the same time simple, robust and "satisfying".
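    A minimal sketch of this kind of setup — agents implemented as small feedforward networks whose weights are evolved by a genetic algorithm against a profit-like fitness — might look as follows. The one-firm demand curve, network size and GA parameters are all illustrative assumptions, not the paper's simulator.

```python
import math, random

random.seed(0)

def net_output(weights, inputs):
    """Single sigmoid unit: weights[0] is the bias."""
    s = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return 1.0 / (1.0 + math.exp(-s))

def fitness(weights):
    """Illustrative 'profit' for a quantity decision q = 10 * net output
    given a constant market signal; maximised at q = 4 for this toy
    demand curve (revenue q * (10 - q) minus unit cost 2)."""
    q = 10.0 * net_output(weights, inputs=[1.0])
    return q * (10.0 - q) - 2.0 * q

def evolve(pop_size=30, generations=60, mut_sd=0.3):
    """Minimal GA: keep the best half unchanged (elitism), refill the
    population with mutated copies of those parents."""
    pop = [[random.gauss(0, 1) for _ in range(2)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        pop = parents + [
            [w + random.gauss(0, mut_sd) for w in p] for p in parents
        ]
    return max(pop, key=fitness)

best = evolve()
q = 10.0 * net_output(best, [1.0])
print(round(q, 2))  # close to the profit-maximising quantity q = 4
```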

    A planning modular neural-network robot for asynchronous multi-goal navigation tasks

    This paper focuses on two planning neural-network controllers, a "forward planner" and a "bidirectional planner". These have been developed within the framework of Sutton's Dyna-PI architectures (planning within reinforcement learning) and have already been presented in previous papers. The novelty of this paper is that the architecture of these planners is made modular in some of its components in order to deal with catastrophic interference. The controllers are tested with a simulated robot engaged in an asynchronous multi-goal path-planning problem that should exacerbate the interference problems. The results show that: (a) the modular planners can cope with multi-goal problems, allowing generalisation while avoiding interference; (b) when dealing with multi-goal problems, the planners keep the advantages shown previously for one-goal problems over plain reinforcement learning; (c) the superiority of the bidirectional planner over the forward planner is confirmed for the multi-goal task.
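    The modular idea used against catastrophic interference can be illustrated abstractly: give each goal its own sub-network, so that training on one goal cannot overwrite the weights serving another. The sketch below uses a bare delta rule and hand-picked goals purely for illustration; it is not the planners' actual architecture.

```python
import random

random.seed(2)

class ModularController:
    """One linear sub-network ('module') per goal: updates for one goal
    leave the other goals' weights untouched, so previously learned
    mappings are not overwritten (no catastrophic interference)."""

    def __init__(self, n_goals, n_inputs):
        self.modules = [
            [random.uniform(-0.1, 0.1) for _ in range(n_inputs)]
            for _ in range(n_goals)
        ]

    def output(self, goal, inputs):
        w = self.modules[goal]
        return sum(wi * xi for wi, xi in zip(w, inputs))

    def train(self, goal, inputs, target, lr=0.5):
        """Delta-rule update applied only to the selected goal's module."""
        err = target - self.output(goal, inputs)
        w = self.modules[goal]
        for i, xi in enumerate(inputs):
            w[i] += lr * err * xi

ctrl = ModularController(n_goals=2, n_inputs=2)
for _ in range(100):
    ctrl.train(0, [1.0, 0.0], target=1.0)    # learn goal 0
before = ctrl.output(0, [1.0, 0.0])
for _ in range(100):
    ctrl.train(1, [0.0, 1.0], target=-1.0)   # learn goal 1 afterwards
after = ctrl.output(0, [1.0, 0.0])
print(before == after)  # → True: goal 0's mapping is untouched
```

    A monolithic network trained the same way would shift its shared weights during the second phase; isolating the modules trades some generalisation for immunity to interference, which is the trade-off the paper's modular architecture addresses.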

    Apprendimento per rinforzo e coordinazione sensomotoria [Reinforcement learning and sensorimotor coordination]

    This work presents a model of the sensorimotor coordination involved in manually reaching for objects. The aim is to assess whether a subsymbolic system, not constrained a priori in terms of connectivity, can develop adaptive perceptual and motor cognitive schemata through processes of assimilation and accommodation. To this end, a simulated organism endowed with an arm and a visual system was built, which receives a positive reinforcement when it brings the end of the arm close to an object. The sensorimotor system is coordinated by a binary artificial neural network that undergoes two learning processes: modification of the units' activation probabilities on the basis of reinforcement, and modification of the network topology by means of a genetic algorithm. The results show that while the first learning process enables the organism to perform the task effectively, the genetic algorithm fails to evolve over time a structure that supports sufficiently selective and specialised sensorimotor schemata.

    A modular neural-network model of the basal ganglia's role in learning and selecting motor behaviours

    This work presents a modular neural-network model (based on reinforcement-learning actor-critic methods) that tries to capture some of the most relevant known aspects of the role that the basal ganglia play in learning and selecting motor behaviours related to different goals. In particular, simulations with the model show that the basal ganglia select "chunks" of behaviour whose "details" are specified by direct sensorimotor pathways, and that emergent modularity can help to deal with multiple behavioural tasks. A "top-down" approach is adopted: the starting point is the adaptive interaction of a (simulated) organism with the environment, and its capacity to learn. An attempt is then made to implement these functions with neural architectures and mechanisms that have a neuroanatomical and neurophysiological empirical foundation.
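    A bare-bones version of the actor-critic mechanism the model builds on — a critic learning state values from a temporal-difference error that simultaneously reinforces the actor's action preferences — can be sketched on a toy chain task. The task, parameters and softmax action selection below are illustrative assumptions, not the model's actual circuitry.

```python
import math, random

random.seed(1)

N_STATES, GOAL = 5, 4            # chain of states 0..4; reward at state 4
ACTIONS = [-1, +1]               # step left or right
alpha, beta, gamma = 0.1, 0.1, 0.95

V = [0.0] * N_STATES                            # critic: state values
prefs = [[0.0, 0.0] for _ in range(N_STATES)]   # actor: action preferences

def choose(state):
    """Softmax (Boltzmann) selection over the actor's two preferences."""
    e0 = math.exp(prefs[state][0])
    z = e0 + math.exp(prefs[state][1])
    return 0 if random.random() < e0 / z else 1

for _ in range(500):                  # episodes
    s = 0
    for _ in range(200):              # step cap keeps episodes bounded
        a = choose(s)
        s2 = min(N_STATES - 1, max(0, s + ACTIONS[a]))
        r = 1.0 if s2 == GOAL else 0.0
        td = r + (0.0 if s2 == GOAL else gamma * V[s2]) - V[s]
        V[s] += alpha * td            # critic: TD update of the state value
        prefs[s][a] += beta * td      # actor: TD error reinforces the action
        if s2 == GOAL:
            break
        s = s2

# After learning, the actor prefers stepping right in every non-goal state.
print([int(prefs[s][1] > prefs[s][0]) for s in range(GOAL)])
```

    The same TD error trains both components, which is the feature that makes actor-critic methods a common abstraction of dopamine-mediated learning in the basal ganglia.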

    Analisi quantitative dell'auto-organizzazione in gruppi di robot simulati [Quantitative analyses of self-organisation in groups of simulated robots]

    An important strand of research in collective autonomous robotics (Grabowski et al., 2003) focuses on how the coordination of groups of robots can be achieved in a distributed fashion, i.e. without centralised coordination by "leader" robots (Baldassarre et al., 2003). Distributed coordination usually relies on self-organisation mechanisms (Camazine et al., 2001) based on the robots' local interactions and communications. A limitation of this line of research is that the self-organisation mechanisms implemented (or emergent, when the controller is developed with automatic search techniques such as genetic algorithms, as in evolutionary robotics; Floreano and Nolfi, 2001) are usually neither precisely identified nor quantitatively described (cf. e.g. Holland and Melhuish, 1999; Ijspeert et al., 2001; Kube and Bonabeau, 1998; Martinoli, 1999). This research analyses results presented in previous works (Baldassarre et al., 2003; Baldassarre et al., 2004) concerning groups of robots physically connected to one another and evolved with genetic algorithms so as to move through space in a coordinated fashion. The aim of the analysis was to identify precisely, and analyse quantitatively, the self-organisation mechanisms that emerged during evolution and that the robots use to coordinate in a distributed fashion.
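    One standard way to quantify the kind of coordinated motion analysed in this line of work is an alignment "order parameter": the length of the mean unit heading vector of the group, which is 1 when all robots move in the same direction and near 0 when headings are incoherent. The sketch below shows the metric itself; it is a common convention in the collective-behaviour literature, not necessarily the measure used in the cited works.

```python
import math

def alignment(headings):
    """Order parameter in [0, 1]: norm of the mean unit heading vector.
    headings: iterable of heading angles in radians, one per robot."""
    n = 0
    sx = sy = 0.0
    for h in headings:
        sx += math.cos(h)
        sy += math.sin(h)
        n += 1
    return math.hypot(sx, sy) / n

print(alignment([0.0, 0.0, 0.0]))           # fully aligned group → 1.0
print(round(alignment([0.0, math.pi]), 3))  # opposite headings → 0.0
```

    Tracking this value over an evolutionary run gives exactly the kind of quantitative description of emergent coordination that the analysis above calls for.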